Billion-Scale Pretraining With Vision Transformers For Multi-Task Visual Representations